runc create/run: warn on rootless + shared pidns + no cgroup #4398
@@ -42,6 +42,7 @@ func Validate(config *configs.Config) error {
 	// Relaxed validation rules for backward compatibility
 	warns := []check{
 		mountsWarn,
+		rootlessSharedPidns, // TODO: make it an error in runc 1.3.
No need to error out if we implement walking the process tree
> No need to error out if we implement walking the process tree
To me it looks neither possible nor desirable.
Not possible because
- init process might be gone already (with some other processes still running);
- walking the tree requires freezing all the processes, as otherwise the walker will be racing with forks;
Not desirable because
- the code will probably be very slow and resource hungry, thanks to the text-based nature of /proc;
- cgroup v1 is going to be obsolete (one day it will; fingers crossed).
Also, the only way to implement this process tree walking nicely (i.e. neither slow nor resource hungry) would be via eBPF (which has access to kernel-internal data structures), but the kernels supporting it are probably running cgroup v2 already. Even with eBPF, the other issues cited above remain.
@AkihiroSuda I think we need more than was done in #4395. First, I am a bit puzzled why we see
from Second, I guess we need to add a special case to WDYT?
Maybe we'll also need to call
Mainly because of `delete -f` in teardown? See runc/tests/integration/helpers.bash, lines 726 to 737 in 4833305.
You are right; I was sure messages from
OK, the validation check can't work right, as it does not know whether the cgroup is actually accessible. We need to log a warning later, when manager.Apply fails. Reworked this PR to do just that.
@@ -580,7 +580,18 @@ func (p *initProcess) start() (retErr error) {
 	// cgroup. We don't need to worry about not doing this and not being root
 	// because we'd be using the rootless cgroup manager in that case.
 	if err := p.manager.Apply(p.pid()); err != nil {
-		return fmt.Errorf("unable to apply cgroup configuration: %w", err)
+		if errors.Is(err, cgroups.ErrRootless) {
For the purpose of warning, I think this PR LGTM.
But furthermore, I think we should serialize this to `state.json`, for example with a field named `noCgroup`; it would be useful for doing `runc kill`.
Please see: #4395 (comment)
Yes, I was thinking about it, too, but let's do one improvement at a time, shall we?
So, we have been trying to cut 1.2.0 for some time now, and this PR is a result of a recent development in the area (a regression described in #4394 and fixed by #4395). As I noted in #4394 (comment), I'm OK with the fix in #4395, but it would be nice to also introduce a warning; this is what this PR does.
I think we can introduce `noCgroup` in runc 1.3. Feel free to open an issue about it so we won't forget.
LGTM, thanks!
+		if errors.Is(err, cgroups.ErrRootless) {
+			// ErrRootless is to be ignored except when the
+			// container doesn't have private pidns.
+			if !p.config.Config.Namespaces.IsPrivate(configs.NEWPID) {
Any reason to not put this condition in the previous if? Do you think it is more readable like this?
I did that initially, but rolled it back later since changing the code like this complicates the review (the whole block changes instead of just one line).
I will add a separate commit that does it.
My bad; I mixed this up with something else. Updated; PTAL @rata
Ughm, I just broke everything.
This is two `if` statements (and an `else`) because we want to
- ignore ErrRootless;
- except when there's no private pidns, in which case we want a warning;
- return all other errors as is.
No way to do that with a single `if`.
Makes sense, I tried to collapse it and I'm not sure it is more readable than this.
It's not a question of readability. There are three different conditions, can't do it with a single if.
@AkihiroSuda PTAL
In these cases, this is exactly what we want to find out. Slightly improves performance and readability.
Signed-off-by: Kir Kolyshkin <[email protected]>

This aids in failed test analysis by making it possible to distinguish the output of the various commands run as part of the test case from the output of teardown commands like runc delete.
Signed-off-by: Kir Kolyshkin <[email protected]>

Shared pid namespace means `runc kill` (or `runc delete -f`) has to kill all container processes, not just init. To do so, it needs a cgroup to read the PIDs from.
If there is no cgroup, processes will be leaked, and so such a configuration is bad and should not be allowed. To keep backward compatibility, though, let's merely warn about this for now.
Alas, the only way to know if cgroup access is available is by returning an error from Manager.Apply. Amend fs cgroup managers to do so (systemd doesn't need it, since v1 can't work with rootless, and cgroup v2 does not have a special rootless case).
Signed-off-by: Kir Kolyshkin <[email protected]>
@@ -129,14 +129,15 @@ func (m *Manager) Apply(pid int) (err error) {
 			// later by Set, which fails with a friendly error (see
 			// if path == "" in Set).
 			if isIgnorableError(c.Rootless, err) && c.Path == "" {
+				retErr = cgroups.ErrRootless
Do you think this is too strict? Maybe we should add a condition here:
if name == "devices" {
retErr = cgroups.ErrRootless
}
It doesn't really matter here because there can't be a situation where devices cgroup can't be created while others can.
+			// the container doesn't have private pidns.
+			if !p.config.Config.Namespaces.IsPrivate(configs.NEWPID) {
+				// TODO: make this an error in runc 1.3.
+				logrus.Warn("Creating a rootless container with no cgroup and no private pid namespace. " +
How about s/no cgroup/no devices cgroup/ ?
I think such a message will have a negative effect, as it will be more confusing to a user (especially in the cgroup v2 case).
Shared pid namespace means `runc kill` (or `runc delete -f`) has to kill all container processes, not just init. To do so, it needs a cgroup to read the PIDs from.

If there is no cgroup, processes will be leaked, and so such a configuration is bad and should not be allowed. To keep backward compatibility, though, let's merely warn about this for now.

Alas, the only way to know if cgroup access is available is by returning an error from `Manager.Apply`. Amend fs cgroup managers to do so (systemd doesn't need it, since v1 can't work with rootless, and cgroup v2 does not have a special rootless case).

Related to #4394, #4395.